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Abstract 

Objectives: It is thought that a proportion of the genetic sus- 
ceptibility to complex diseases is due to low-frequency and 
rare variants. Next-generation sequencing in large popula- 
tions facilitates the detection of rare variant associations to 
disease risk. In order to achieve adequate power to detect 
association at low-frequency and rare variants, locus-specif- 
ic statistical methods are being developed that combine in- 
formation across variants within afunctional unit and test for 
association with this enriched signal through so-called bur- 
den tests. Methods: We propose a hierarchical clustering ap- 
proach and a similarity kernel-based association test for con- 
tinuous phenotypes. This method clusters individuals into 
groups, within which samples are assumed to be genetically 
similar, and subsequently tests the group effects among the 
different clusters. Results: The power of this approach is 
comparable to that of collapsing methods when causal vari- 
ants have the same direction of effect, but its power is sig- 
nificantly higher compared to burden tests when both pro- 
tective and risk variants are present in the region of interest. 



Overall, we observe that the Sequence Kernel Association 
Test (SKAT) is the most powerful approach under the allelic 
architectures considered. Conclusions: In our overall com- 
parison, we find the analytical framework within which SKAT 
operates to yield higher power and to control type I error 

appropriately. Copyright © 201 3 S. Karger AG, Basel 



Introduction 

Large-scale genome-wide association studies have 
been successful in identifying common variants influ- 
encing complex traits. Although these findings have im- 
proved our understanding of the genetic basis of many 
complex traits, for most of the traits they explain only a 
fraction of heritability. This observation supports the 
long established idea that low- frequency and rare variants 
play an important role in common diseases. Thanks to 
recent advances in sequencing and genotyping technol- 
ogy and to large collaborative efforts that study human 
sequence variation and the allelic architecture of disease, 
such as the 1000 Genomes Project (www.lOOOgenomes. 
org) [1] and the UK10K project (www.ukl0k.org), we are 
now able to study the contribution of low- frequency and 
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rare variants in complex traits. The different strategies in 
place today to search for low- frequency and rare variants 
affecting complex traits largely fall under 3 categories: ge- 
nome-wide association studies using dense genotyping 
platforms (for example, Illumina 2.5M and the exome 
chip), imputation, and next-generation sequencing. Here 
we focus on the analysis of data generated by next-gener- 
ation sequencing, although the methods presented here 
are applicable to data from the other categories. We also 
discuss ways of extending the methods to allow for geno- 
type uncertainty, which can arise from imputation or 
low-coverage next-generation sequencing. 

Single-point analysis of low-frequency (typically de- 
fined as those with minor allele frequency (MAF) be- 
tween 1 and 5%) or rare variants (those with MAF <1%) 
is under-powered, as not enough copies of the minor al- 
lele are observed. An alternative approach is to use meth- 
ods that combine information across multiple low-fre- 
quency or rare variant sites within a region, which can be 
a gene or any other functional genomic region. Currently, 
there are more than thirty methods developed for rare 
variant burden analysis, which use different approaches 
to combine information across multiple variants within 
the region of interest. One group of tests consists of col- 
lapsing methods based on summary statistics (the Cohort 
Allelic Sums Test [2], Combined Multivariate and Col- 
lapsing method [3], Weighted Sum Test [4], Variable- 
Threshold approach [5] ), another group of tests are based 
on similarities among individual sequences (the Kernel- 
Based Association Test [6], Sequence Kernel Association 
Test [7] - SKAT), while other methods are based on re- 
gression models that use collapsed sets of variants and 
other factors as predictors (collapsing using proportion 
of rare variants [8], adaptive sums [9], LASSO or ridge 
regression-based approaches [10]). Collapsing tests ag- 
gregate information across multiple variants into a single 
quantity, which is then used to test for association of dis- 
ease with an accumulation of rare minor alleles, whereas 
methods based on similarities among individual sequenc- 
es are multi-marker tests that combine single-variant test 
statistics. Methods that use the regression model have the 
advantage that they can easily adjust for covariates and 
can easily handle both a continuous and case-control out- 
come. 

Collapsing methods vary in the way they collapse low- 
frequency/rare variants and in their chosen test statistic. 
A popular burden test was proposed by Morris and Zeg- 
gini [8] , which models a continuous or binary phenotype 
as a function of the proportion of low-frequency/rare 
variants at which an individual carries a minor allele 



within a regression framework. This model implicitly as- 
sumes that all collapsed variants are associated with dis- 
ease, and that they have the same direction of effect, which 
can be either deleterious or protective. Moreover, it is not 
robust to linkage disequilibrium (LD), as correlation be- 
tween the collapsed variants can lead to inflation of the 
test statistic. However, we do not expect this to be a wor- 
ry for the analysis of rare variants, as we expect there to 
be relatively limited LD between them, because they are 
likely to have occurred relatively recently in the ancestry 
of the population. A different kind of approach is SKAT 
[7], which is based on a weighted linear kernel function 
that measures the genetic similarity among the study 
samples. A continuous or binary phenotype is also mod- 
elled within a regression framework, but as variants are 
not collapsed together, it allows each variant to have its 
own direction and magnitude of effect or even no effect. 

Another approach that uses similarities among indi- 
vidual sequences is the Kernel- Based Association Test 
(KBAT) proposed by Mukhopadhyay et al. [6]. This 
method focuses on a case-control phenotype and mea- 
sures the genetic similarity for each pair within cases and 
controls at each low-frequency/rare variant by counting 
the common alleles between the genotypes of two indi- 
viduals. This scoring scheme is based on the Allele Match 
(AM) kernel, and the similarity scores between individu- 
als are modelled via a one-way ANOVA model that tests 
whether the group effects are the same for cases and con- 
trols. KBAT is a nonparametric test that makes no as- 
sumptions about the direction of individual SNP effects 
and is robust to LD. Asimit et al. [11] extended KBAT to 
account for genotype uncertainty of sequence- derived 
data in a software called AMELIA (Allele Matching Em- 
pirical Locus- specific Integrated Association test), but the 
method is only applicable for a case/ control outcome. In- 
corporating sequence-derived variant calling and geno- 
type probability uncertainty into the analysis of sequence- 
based datasets is important, because in sequencing data- 
sets there is uncertainty surrounding variant calling 
(which is more pronounced in sequencing projects at low 
depths), while genotype misclassification is more of a 
concern compared to datasets arising from genotyping 
platforms. These types of risks can lead to loss of power 
for subsequent association studies, which is illustrated in 
the results of Asimit et al. [11], according to which AME- 
LIA is consistently more powerful than its unweighted 
equivalent KBAT. 

Here we focus on a continuous phenotype and we ex- 
plore a hierarchical clustering and kernel-based associa- 
tion test. Specifically, we suggest using the AM kernel to 
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cluster individuals into disjoint clusters and test if the 
means of the phenotypes within each cluster are the same 
via ANOVA. We use a simple hierarchical clustering al- 
gorithm, where our distance matrix is one minus a stan- 
dardized AM similarity score matrix. To investigate the 
performance of the proposed method and its power and 
type I error, we performed an extensive simulation study 
with varying sample sizes, different magnitudes of genet- 
ic signal and different allelic architectures and compared 
it to two of the most popularly used rare variant analysis 
methods. We find that its power is comparable to that of 
the collapsing test based on the proportion of low-fre- 
quency/ rare variants when causal variants have the same 
direction of effect, but its power is almost double than 
that of the burden test when both protective and risk vari- 
ants exist. SKAT is, however, the clear winner in both al- 
lelic architecture scenarios. 

Methods 

Collapsing Methods 

Morris and Zeggini [8] introduced a collapsing method based 
on the proportion of low-frequency/rare variants at which an in- 
dividual carries a minor allele. If «, is the number of low- frequen- 
cy/rare variants present in the r'-th individual with a quantitative 
phenotype y b then for the analysis of a quantitative trait this col- 
lapsing test is implemented as 



where ft is the predicted mean of y under H 0 . For a study of n in- 
dividuals, Kis an n x n matrix whose ((', (') element is given by the 
weighted kernel function 



(2) 



n. 



(1) 



where A is the increase in the phenotype value for an individual 
carrying minor alleles at r ; low- frequency/rare loci compared to an 
individual carrying none, /$ denote regression coefficients for a h k (g k ,g k . ) 
vector of covariates x„ and e ; ~ N(0,o E ). Association of an accumu- 
lation of low-frequency/rare variants with the phenotype can be 
tested by examining the null hypothesis H 0 : A = 0 by using a likeli 
hood ratio test. This method has been implemented in the GRAN 
VIL software (http://www.well.ox.ac.uk/GRANVIL). 



where G is an n x p matrix with (;', j) element being the genotype 
of variant j of individual i. 

SKAT puts more weight on rare variants and non-zero weight 
to variants with MAF 1-5%. This is achieved by assuming that 
VJJi)j) has magnitude given by the Beta density function Beta(M AF ; -; 
«i, a 2 ) evaluated at the MAF of variant j and with parameters «i = 
1 and a 2 = 25. Other choices for weights are also discussed in Wu 
etal. [7]. 

Under H 0 , Q follows a mixture of x 2 distributions (please refer 
to Wu et al. [7] for an explanation of how this is approximated). 
SKAT can be applied to all variants across the region of interest, 
but we have limited it to low-frequency or rare variants to make 
results comparable to the rest of the methods we tested. 

A Hierarchical Clustering and Kernel-Based Association Test 
We propose a simple hierarchical clustering approach, where 
the individuals in the study are allocated to distinct clusters ac- 
cording to their genotype similarity across the region of interest. 
The similarity measure we employ is the AM kernel score sug- 
gested by Mukhopadhyay et al. [6]. The AM kernel is the number 
of alleles shared between two individuals at a single locus. It is 
flexible, as it makes no assumptions on the risk allele of each vari- 
ant. 

We then use the ANOVA model to test if the means of the phe- 
notypes within each cluster differ significantly. Specifically, for any 
pair (i,j) of individuals (' t j in the study with sample size n and for 
each low- frequency/rare variant k in the region of interest span- 
ning p low-frequency/rare variants, the AM kernel assigns a simi- 
larity score 



4. iff{ 

2. if 4=1.4^ {0.2} or 4 = 1,4 e{0,2} 
0 , otherwise 



(4) 



Sequence Kernel Association Test 

SKAT [7] is implemented within a multiple regression frame- 
work, where phenotype y t of sample i is modelled as 



where g§ and gh are the number of minor allele individuals and 
j carry at low-frequency/rare variant k. For any pair (i, j) of indi- 
viduals i t j, the AM score across the region of interest is simply 
the sum of the single-SNV AM scores, i.e. 



y t = a 0 + axi + [Sgj + e ; , 



(2) 



where /} is a vector of regression coefficients for genotypes g, = (g a , 
g ip ) that individual i carries at p variants within the region, and 
gij = 0, 1 , or 2 is the number of copies of the minor allele. Moreover, 
a are regression coefficients for a vector of covariates x h and £, ~ 
N(0,o E ). 

SKAT assumes that each Bj follows an arbitrary distribution 
centered at zero with variance ayr, where r is a variance component 
and tOj is a prespecified weight for variant j. To test for association 
of the variants with the phenotype, SKAT tests H Q : r = 0 by employ- 
ing a variance component score statistic given by 



Q=(y-ib'KHy-fi), 



(3) 



We perform a hierarchical cluster analysis using the AM score 
to group individuals according to their genotype similarity or 
rather according to their genotype dissimilarity, which is repre- 
sented by the n x n D matrix with (;', j) element being d(m = 1 - 
g(/))/4p. Hierarchical clustering works by initially assigning 
each individual to their own cluster and then the algorithm pro- 
ceeds iteratively at each stage joining the two most similar clusters, 
continuing until there is just a single cluster. At each stage, dis- 
tances between clusters are recomputed according to the particular 
clustering method being used. We use Ward's hierarchical cluster- 
ing, which minimizes the information loss (defined by Ward in 
terms of an error sum-of-squares criterion) associated with each 
grouping. Hierarchical cluster analysis can be readily implemented 
by using the standard statistics package R, and returns a dendo- 
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gram, i.e. a tree whose branches represent the different individuals, 
so that genetically 'close' individuals appear in the same or neigh- 
bouring branches. A caveat when using hierarchical clustering is 
that in order to allocate samples to groups, the user needs to 'cut' 
the tree at a prespecified height, or in other words, to prespecify 
the number of clusters. 

If the number of clusters is specified to L, the study samples are 
allocated to these L distinct clusters according to their position in 
the dendogram. We then model the phenotype yn of individual i = 
1, n belonging to cluster I = 1, L using a one-way ANOVA 
model as 

yn = [i + ai + £,-, (5) 

where \x is the general effect of individuals, «/ is the cluster-specif- 
ic treatment effect, and e ; is the error component. If individuals' 
phenotypes between clusters are similar, then a l = — = a L , and 
therefore H 0 for testing disease association is a 1 = — = a L = 0. 
The within-group sum of squares is 

SSW = ELK'Jy* ~ U.f'^ere U, = ^y,, I », 

and «; denotes the number of individuals in group I. The between- 
group sum of squares is 

SSB = Y^ L l= ni(U,-U)\whereU = (U 1 + --- + U L )/n. 
To test association with disease, the ANOVA statistic is given 

by 

SSB/(L-l) 

F = i - (6) 

SSW/(n-L) 

which follows an F -distribution with (L - 1) and (n - L) degrees 
of freedom. 

We vary the number of clusters from 1 to 20, and we repeat the 
ANOVA modelling step, which is very fast, and report the maxi- 
mum test statistic. We find that using an upper threshold of 20 for 
the possible number of clusters is sufficient to capture the genetic 
diversity of the study samples, by examining the average number 
of clusters that correspond to the maximum test statistic in our 
simulation study (e.g. for a simulation scenario with 3,000 indi- 
viduals, the mean and median value was 8.7 and 6, respectively, 
across 1,000 replications). To assign significance to the associa- 
tion test, we perform 10,000 permutations of the phenotype to 
individuals and we repeat the ANOVA step with the number of 
clusters ranging from 1 to 20, without the need to repeat the hier- 
archical clustering. The p value of significance of the association 
test is the proportion of permutations that resulted in a maximum 
test statistic bigger than or equal to the original maximum test 
statistic. 

It is possible to down-weight variants that are not thought to 
be disease-associated and up-weight variants that are believed to 
be disease-associated, so that the contribution of the latter to the 
AM score across the region of interest is higher. We have therefore 
implemented a version of the model that incorporates weights. In 
this instance, the AM score across the region of interest is given 
by 

*'(^.«w)=e:. 1 ^(4»4)' 



where to^ is the weight of variant k, and the dissimilarity matrix 
D' has element equal to 

% =i-*Wf W )'E:.,4-*- 

The un-weighted version of the proposed approach can be 
viewed as the weighted version with weights to^ = 1 for all k = 1, 
p. Different weighing schemes are possible, and in our simulation 
study, we used the one proposed by Wu et al. [7] , where V(w^) has 
a magnitude given by the Beta density function Beta(MAFj-; 1, 25) 
evaluated at the MAF of variant k (please see above), so that more 
weight is assigned to rarer variants. 

The method is implemented in an R- and C-based software 
called KATE (Kernel Association Test Extended). KATE is freely 
available at http://www.sanger.ac.uk/resources/software/kate/. 

Simulation Study 

We conducted an extensive simulation study to examine the 
type I error and power of the proposed method, and to compare 
its performance to other popular rare variant methods under dif- 
ferent allelic architectures. Haplotype data were simulated under 
the null model of no genetic association for 2N individuals by using 
the HAPGEN2 software [ 12], so that the allele frequency spectrum 
and LD structure is the same with the 1000 Genomes haplotype 
data (Phase I interim June 2011 release in NCBI build 37 coordi- 
nates, 89 GBR samples). We limited the simulated region to a rep- 
resentative gene on chromosome 21 (PFKL with coordinates 
45,719,934 and 45,747,259 base pairs) with 143 polymorphic SNPs 
(39 SNPs have MAF <5%, and 17 SNPs have MAF <1%). 

Haplotypes were randomly paired to form AT individuals, where 
N = 1,000, 2,000, 3,000. Causal variants were randomly selected 
such that their individual MAF did not exceed 0.02 and their total 
MAF did not exceed 0.05, as implemented in Asimit et al. [11]. A 
continuous phenotype y ; was assigned to each individual accord- 
ing to a linear model 

yi = Pigfi + "+ Psgts + (?) 

where s is the number of causal variants, gf ly g- s are the genotypes 
of the causal variants for individual and e ; follows a standard 
normal distribution. Each causal variant k = 1, s had an effect fa 
on the phenotype. To examine the type I error, all regression coef- 
ficients fa were set to 0, which translated to the null disease model 
of no association. We simulated scenarios where all causal variants 
were deleterious, and other scenarios where half of the causal vari- 
ants were deleterious and the rest were protective. For the deleteri- 
ous variants, fa was set to 1.2, 2 and 3, whereas for the protective 
variants, fa = -1.2, -2, -3. We also considered a scenario where 
the effect size was not fixed, but it varied with the MAF of the 
causal variant. Following Wu et al. [7], we set the magnitude of 
each fa as \fa\ = 0.4|logioMAF,t| to allow rarer causal variants to 
have larger effects. As the individual MAF cannot exceed 0.02, the 
smallest individual effect is \fa\ = 0.68. Table 1 summarizes the dif- 
ferent parameter settings used for the effect sizes. 

Using this procedure we generated 1,000 replicates under each 
simulation scenario. We subsequently used the simulated data 
with KATE, GRANVIL and SKAT. For each simulation scenario 
where the data have been simulated with a genetic signal (regres- 
sion coefficients fa > 0), the power of each test is calculated as the 
proportion of replicates for which the test yielded a p value of less 
than a 5% significance threshold. For data simulated under the null 
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Table 1 . Effect sizes used in the simulation study for each causal variant k 



Deleterious effects only 1.2 2 3 0.4|logioMAFjJ 

Deleterious and protective effects -1.2,1.2 -2,2 -3,3 -0.4|log 10 MAF t |, 0.4|log 10 MAFil 



In the presence of different directions of effect (second line of the table), the two effect sizes reported are evenly distributed among 
the causal variants. 



disease model of no association, we calculate the false-positive rate 
of each test as the proportion of replicates for which the test yield- 
ed a p value of less than 5%. 



Results 

Table 2 reports the type I error rate together with the 
corresponding 95% confidence intervals for KATE, 
GRANVIL and SKAT. We observe that all methods con- 
trol the type I error rate well at 5%, with perhaps the ex- 
ception of KATE in the scenario where weights are as- 
sumed to have a Beta distribution, where the type I error 
rate is closer to 6%. The 95% confidence interval for a 5% 
error rate is (3.6, 6.4)%, so KATE and GRANVIL are not 
always consistent with a 5% error rate. 

Tables 3 and 4 show the empirical power of the three 
methods for the different sample sizes and under differ- 
ent magnitudes of effect sizes when causal variants are 
assumed to be only risk- increasing and when causal vari- 
ants are also allowed to be protective, respectively. Over- 
all, we observe that SKAT is the most powerful approach 
under any allelic architecture considered. When causal 
variants have the same direction of effect, GRANVIL 
tends to perform better than KATE by a factor of approx- 
imately 16% for weaker effects, and by 1-8% for stronger 
effects. However, the power of GRANVIL sharply de- 
creases when the direction of effect differs among the 
causal variants, while KATE's performance is not affect- 
ed. In this scenario, KATE's power is between 32 and 50% 
higher than that of GRANVIL, depending on the magni- 
tude of the association signal. Moreover, the use of weights 
for KATE increases its power by 6-8%. As expected, pow- 
er increases with increasing sample size and magnitude of 
effect for all methods. The increase in power with sample 
size is more pronounced for KATE. 

These results are based on a single gene region (gene 
PFKL in chromosome 21). This gene is representative of 
chromosome 21 in terms of size and number of variants 
with MAF <5 or <1%, and we therefore believe that our 



Table 2. Type I error rates and their respective confidence intervals 
(in percentage) for a threshold of 5% for the different methods un- 
der the various simulation scenarios considered 



N 


KATE 




GRANVIL 


SKAT 




weights = 1 


weights ~ beta 






1,000 
2,000 
3,000 


5.2 (3.8, 6.6) 
4.9 (3.6, 6.2) 
5.1 (3.7, 6.5) 


5.9 (4.4, 7.4) 
5.9 (4.4, 7.4) 
5.7 (4.3, 7.1) 


5.1 (3.7, 6.5) 

4.6 (3.3, 5.9) 

5.7 (4.3, 7.1) 


4.5 (3.2, 5.8) 
4.9 (3.6, 6.2) 
4.9 (3.6, 6.2) 



Table 3. Power comparisons for the different methods under the 
various effect and sample sizes considered when causal variants 
have the same direction of effect 



KATE GRANVIL SKAT 



weights = 1 weights ~ beta 



N = 1,000 










|beta|=1.2 


0.671 


0.713 


0.794 


1 


|beta|=2.0 


0.935 


0.954 


0.953 


1 


|beta|=3.0 


0.969 


0.974 


0.984 


1 


jbetaj « MAF 


0.320 


0.395 


0.558 


0.971 


N = 2,000 










|beta|=1.2 


0.887 


0.899 


0.937 


1 


|beta|=2.0 


0.986 


0.985 


0.981 


1 


|beta|=3.0 


0.988 


0.991 


0.998 


1 


jbetaj « MAF 


0.568 


0.648 


0.817 


1 


N = 3,000 










|beta|=1.2 


0.958 


0.965 


0.975 


1 


jbetaj=2.0 


0.982 


0.987 


0.996 


1 


jbetaj=3.0 


0.996 


0.998 


0.999 


1 


jbetaj oc MAF 


0.759 


0.799 


0.919 


1 



conclusions are robust to the choice of gene. However, all 
three methods can be implemented genome-wide, which 
would give a more complete picture of their relative 
power. 
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Table 4. Power comparisons for the different methods under the 
various effect and sample sizes considered when there exist both 
protective and risk variants 





KATE 




GRANVIL SKAT 




weights = 1 


weights ~ beta 






N = 1,000 










|beta|=1.2 


0.625 


0.665 


0.220 


0.999 


jbetaj =2.0 


0.921 


0.934 


0.367 


1 


jbetaj =3.0 


0.969 


0.973 


0.480 


1 


jbetaj oc MAF 


0.331 


0.393 


0.147 


0.970 


N = 2,000 










|beta|=1.2 


0.891 


0.909 


0.380 


0.999 


|beta|=2.0 


0.982 


0.983 


0.505 


1 


jbetaj =3.0 


0.991 


0.994 


0.621 


1 


jbetaj oc MAF 


0.577 


0.636 


0.213 


0.998 


N = 3,000 










|beta|=1.2 


0.952 


0.960 


0.426 


1 


jbetaj =2.0 


0.987 


0.990 


0.599 


1 


jbetaj =3.0 


0.988 


0.991 


0.672 


1 


jbetaj oc MAF 


0.769 


0.816 


0.292 


1 



Discussion 

Powerful allele -matching approaches for the analysis 
of rare variants have been proposed [6, 13], but their rel- 
ative power for quantitative traits remains unclear. We 
have presented KATE as a method for low-frequency/ 
rare variants association analysis that exploits the AM 
kernel to model a continuous trait. This method does not 
make assumptions about the directionality of effect and 
is valid when the variants in the region of interest are in 
LD. KATE can be implemented using both common and 
low-frequency/rare variants, which is also true for SKAT, 
but we used only the low- frequency/ rare variants to make 
results comparable to the results from collapsing meth- 
ods. Using an extensive simulation study, we find that 
KATE has similar power to GRANVIL when all causal 
variants are designed to have the same direction of effect. 
In the presence of both deleterious and protective vari- 
ants, the power of GRANVIL sharply decreases, whereas 
KATE's performance is not affected. 

In terms of computational speed, both GRANVIL and 
SKAT are very fast, as they do not require permutation 
testing. In contrast, KATE relies on permutation to assign 
significance. KATE is reasonably fast for a small number 
of samples (N = 1,000), but becomes computational in- 
tensive as the number of samples increases (N = 3,000). 



We used 10,000 permutations to assign significance in the 
simulation study presented here, although an adaptive 
number of permutations is also possible and could reduce 
KATE's running time. In this adaptive design, the default 
number of permutations can be 1,000 and increased to 
10,000 only if the p value is less than a pre-specified 
threshold, as implemented by Asimit et al. [11]. 

It has been shown that the use of weights can increase 
the power of rare variant methods. We illustrate this by 
implementing the weights used in Wu et al. [7], which 
up-weight rare variants. However, any type of weighting 
scheme can be easily implemented with KATE, such as 
weights derived from functional annotation tools. More- 
over, it is straightforward to adjust the model for geno- 
type uncertainty for imputed variants by incorporating 
genotype probabilities in the AM scoring system. Variant 
quality scores available for sequence- derived datasets can 
also easily be adjusted for in KATE in the same way as il- 
lustrated in AMELIA [11]. 

It is straightforward to extend KATE to include covari- 
ates in order to increase the precision of comparisons be- 
tween groups by accounting for variation on important 
prognostic variables, and to adjust comparisons between 
groups for imbalances in important prognostic variables 
between these groups. This can be achieved by replacing 
ANOVA with ANCOVA (analysis of covariance), which 
is a technique that combines analysis of variance and re- 
gression analysis. In this situation, the permutation pro- 
cedure needs to be adapted, so that at each permutation 
and for each pair of individuals we exchange not only 
their phenotypes but also their covariate values. 

In our overall comparison, we find the analytical 
framework within which SKAT operates to yield higher 
power and to control type I error appropriately. 
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